ABSTRACT

The use of Relational Database Management Systems
(RDBMS), a type of microcomputer application software, to analyze
open-ended survey questions is discussed. Using open-ended questions 
allows researchers to ask respondents to express themselves freely 
about their attitudes and beliefs. This approach also can elicit a 
precise answer even though the list of possible answers is too large 
to list in the instrument or too long for most respondents to read 
(such as student major or home town). The characteristics of an RDBMS 
that allow for analysis of open-ended questions are: flexibility to 
create fields after the database has been designed, ability to join
databases, and ability to sort on any field in the database. RDBMS 
can handle unstructured data and can use the relational operators 
"join" and "project" when using CONDOR, or the relational operator 
"copy" when using dBASE. The RDBMS can be used to clarify who is 
responding to open-ended questions in surveys, thus making the 
comments more useful, even in cases of underenumeration. The
underenumeration problem can also be approached through effective
design techniques. Another feature of RDBMS is that it allows the 
creation of a data entry screen. In addition to identifying technical 
considerations related to the use of RDBMS, an example of the 
automated Q-sort is provided. (SW) 
Abstract

A Relational Data Base Management System (RDBMS) and good
instrument design can be used together to overcome the problems
previously associated with analysis of open-ended questions in
surveys.

Using open-ended questions allows the researcher to

• ask respondents to express themselves freely about their
attitudes and beliefs, especially to clarify a closed-ended
evaluation or to explore a previously unresearched topic, and

• elicit a precise answer even though the list of possible
answers is too large to list in the instrument or too long
for most respondents to read (such as "student major" or
"home town").

The characteristics of an RDBMS which allow for analysis of
open-ended questions are the

• flexibility to create fields after the database has been
designed,

• ability to join databases, and

• ability to sort on any field in the database.

This paper describes how CONDOR, an RDBMS, is used to allow
efficient analysis of open-ended survey questions.
Why We Use Closed-Ended Survey Questions

Higher education researchers use surveys to determine the

• satisfaction of current and former students;

• satisfaction of employers and transfer institutions;

• attitudes of students, staff, and the service community
toward policy issues; and

• needs of current and potential students and employers.

A review of the surveys used in most of our colleges would reveal
a strong bias for structured closed-ended questions. Even
factual information is not requested in an open-ended fashion in
many surveys. Instead, respondents are asked to provide their
occupation code, for example, from a lengthy list of such codes.

What has inclined us as researchers to use such structured
closed-ended questions? I believe there have been two forces at
work on our thinking and behavior:

• The desire to make research on human behavior more
scientific, and therefore to assure that data is more
quantifiable, and

• The technological changes which have allowed researchers to
manipulate large volumes of quantified data on human
behavior.






The former force leads us to mistrust qualitative evaluation of
student, faculty, and service community opinion because it
increases our awareness of the role of selective perception and
evaluation of data. Taken to the extreme, this orientation leads us
to base change only on the results of well-designed surveys
with appropriate controls to prevent undersampling,
oversampling, and statistical error, rather than upon educated
hunches about what should happen in a given environment.

The second force is technological change. Along with the desire
for more scientifically based decision making regarding human
behavior came new developments in computers which allowed
researchers to automate statistical analysis.

The researchers' love affair with numeric analysis of human
behavior heated up just about fifteen years ago when SPSS
(Statistical Package for the Social Sciences) was first up and
running on a few university mainframes around the country. Now
that most researchers do even their frequency analyses on SPSS or
SAS (Statistical Analysis System), we require that our data be
codable in a form these systems can read: simple numerics.

This love affair has inclined us toward pre-coded data, which
requires that the researcher determine in advance which
categories the respondent will choose. Even post-coding done by
technical staff is based on codes established by the researcher
before all responses have been studied.
Implicit in this numeric orientation is an assumption that we
know how to categorize data even before we look at it. In part,
this is a carryover from research designs in which correlations
are intended. But, in college surveys, most analysis consists of
purely descriptive statistics: frequencies and cross-tabulations.
Pre-coded data is not essential for descriptive analysis.

In fact, much of our survey work in colleges would be enhanced if
we, as researchers, would let the data speak for itself, finding
the tone in what respondents say and not just the frequency with
which they say it.

The open-ended survey question is one way of allowing for the
kind of openness in collecting data that we need. But how can
we systematically analyze the responses to such questions?
If the questions are not properly structured or cannot be
cross-tabulated with other closed-ended questions, we will have
problems with underenumeration, that is, too few respondents
answering the question to be safe in concluding that their
responses are representative of the total sample (Dillman, 1977).
For these reasons, it is not surprising that some advice givers
simply suggest not using open-ended questions at all (Pride,
1983).
At least one analytic process has been developed which allows for
minimal bias in the review of open-ended questions. I remember
the technique, called Q-sort, from my graduate school Research
Methods class. You may recall how it goes. The researcher
writes all the open-ended statements of respondents on individual
3 by 5 cards. Then she reviews all the cards to let the data
suggest a method of organization and categorization. When the
data speaks to her (so to speak) she sends the cat to the nearest
kennel, stocks the refrigerator, and starts the days of question
sorting (Q-sort) by placing each statement in its proper pile
somewhere on the living room floor. Large piles are saved for
future weekends (when the cat is out of the house) for another
iteration of the same process. Finally, after weeks of work and
an estranged pet, all comments are properly sorted, bound with
rubber bands, and ready for use in report writing.

This method may work for graduate students in a research class,
but it is not very practical in a busy institutional research
office. If the whole sorting process could be automated, just as
SPSS or SAS automated all the number crunching just a few years
ago, researchers would have an avenue for analysis of open-ended
questions.

Automating the Q-sort is what this paper is about.
Criteria for Use and Analysis of Open-ended Questions

Researchers have learned that open-ended questions are extremely
valuable in specific settings, as summarized by Dillman, 19A4:

• exploratory research where the objective is to find the most
salient aspect of a topic for use in closed-ended questions
in later studies

• when respondents need to vent frustrations or state strong
opinions

• in partially closed-ended questions where the explanation of
the option "other" is desired

• when it would be unnecessarily time-consuming for the
respondent to read a long list of possible responses for a
closed-ended question (i.e., make of car), and

• clarifying closed-ended responses.

An automated Q-sort should be able to categorize each comment
after reviewing all the responses. Just as in the manual
process, the automated approach must be repeatable again and
again.
To determine the impact of underenumeration, there is a need for
the automated process to do much more than was possible in the
manual setting. Specifically, responses to open-ended questions
must be cross-tabulated with closed-ended questions to better
determine the inclinations of those who make comments.

The automated Q-sort needs to be a system flexible enough to
allow categories to be developed after the data has been
reviewed. No standard file management system nor hierarchical
data base management system can achieve these objectives.

Using a Relational Data Base Management System

The type of microcomputer application software which will meet
the need for analysis of open-ended questions is a Relational
Data Base Management System (RDBMS).

Kruglinski (1983) characterized RDBMS as a data base product with
the following features:

• Allows operations on an entire database with a single command

• Does not require that all information needs be planned in
advance. In fact, relationships are specified at the time of
inquiry rather than in advance

• Contains the relational operators "project" and "join".
"Project" is an operation which creates a new relation by
selecting a subset of the existing relation, and "join"
simply combines two separate relations.
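
The two operators can be sketched in a few lines of Python, used here only as a stand-in because the paper does not show CONDOR or dBASE syntax. A relation is modeled as a list of records; the function names, field names, and sample records below are assumptions made for illustration.

```python
# Illustrative sketch only: a relation is modeled as a list of dicts.
# Field names and sample records are hypothetical, not CONDOR syntax.

def project(relation, fields):
    """Create a new relation holding only the named fields of each record."""
    return [{f: rec[f] for f in fields} for rec in relation]

def join(left, right, key):
    """Combine two relations, pairing records that share the same key value."""
    index = {}
    for rec in right:
        index.setdefault(rec[key], []).append(rec)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

ratings = [{"id": 1, "rating": 3}, {"id": 2, "rating": 1}]
comments = [{"id": 1, "comment": "Very helpful staff"},
            {"id": 2, "comment": "Hours too short"}]

combined = join(ratings, comments, "id")
comments_only = project(combined, ["id", "comment"])
```

Because both operators return a new relation, their results can be fed back into further joins and projections, which is what makes the repeated sorting passes described later possible.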

Examples of microcomputer RDBMS are dBASE II and dBASE III, 
R:base, and CONDOR. The author's experience is entirely with 
CONDOR; however, the principles described here apply to any true 
RDBMS whether on a micro or larger computer.

The characteristics of the RDBMS which allow it to automate the 
Q-sort process are: 

• use of the relational operators "join" and "project" (or
"copy" in dBASE), and

• the ability to handle unstructured data.

An Example of the Automated Q-Sort 

Just as in the manual Q-sort, the first step when using RDBMS is 
to review the data in total to determine what categories present 
themselves. In the automated situation, the comments of all
respondents to any single open-ended question will be available
on a screen or in a nicely printed list. The "magic" required to
avoid the step of creating all those 3 by 5 cards is described
in the Data Entry Process section of this paper.



Let's use as our example a factual type of open-ended question
from a student follow-up study: What is your current occupation?
Here is part of what the researcher would see on the database
screen:

Accountant
Accounting technician
Administrative Assistant
Admin Asst
Artist
Bookkeeper in spouse's business
Commercial fisherman
Floor nurse, RN
Sales clerk
School aid
Secretary
Teacher

The researcher reviews the list of comments and creates
categories. Once categories are determined, the researcher puts
a code by each sentence representing its code in the category
system. This is a process which can be easily reiterated, so the
first time through, the coding may be as simple as positive vs.
negative comments.




In this occupations example, the researcher might want to code
the level of education generally required for the occupation. At
this stage it is helpful to make categories consistent and to
tally the number for occupations with more than one response:

BA   Accountant
AA   Accounting technician
AA   Administrative Assistant 2
BA   Artist
AA   Bookkeeper in spouse's business
HS   Commercial fisherman
AA   Floor nurse, RN
HS   Sales clerk
HS   School aid
AA   Secretary
BA   Teacher

Now, and this illustrates the flexibility of an RDBMS, a new
field is created in the database for this new code. Because this
field now exists, the researcher can do a wide variety of useful
things:

• List only the occupations which require an associate degree
level

    Accounting technician
    Administrative Assistant 2
    Bookkeeper in spouse's business
    Floor nurse, RN
    Secretary

• Q-sort this new list again into subject area categories which
match the degrees offered at the college. This reiteration
of the process requires the creation of yet another field in
the database

    Certificate level accounting occupations:
        Accounting technician
        Bookkeeper in spouse's business
    Clerical occupations:
        Administrative Assistant 2
        Secretary
    Allied Health occupations:
        Floor nurse, RN

• Cross tab the occupations with any other question in the
survey, for example, to find out if more of one sex tended to
be in certain occupations, or if those with "personal
interest" intent are in different occupations from those with
"transfer" or "job related" intents

    Personal Interest Students Occupation:
        Accountant
        Artist
        Bookkeeper in spouse's business
        Sales clerk
        Teacher
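
The steps above can be sketched in Python as a rough analogue of what the RDBMS does. The occupations and education codes come from the example; the record layout, the `intent` values for respondents outside the Personal Interest list, and all variable names are assumptions made for illustration.

```python
from collections import Counter

# Responses as typed at data entry. The "intent" values for respondents
# outside the paper's Personal Interest list are hypothetical placeholders.
records = [
    {"id": 1, "occupation": "Accountant", "intent": "personal interest"},
    {"id": 2, "occupation": "Accounting technician", "intent": "job related"},
    {"id": 3, "occupation": "Administrative Assistant", "intent": "job related"},
    {"id": 4, "occupation": "Admin Asst", "intent": "transfer"},
    {"id": 5, "occupation": "Floor nurse, RN", "intent": "job related"},
    {"id": 6, "occupation": "Sales clerk", "intent": "personal interest"},
]

# Categories are decided only after every response has been reviewed.
education_code = {
    "Accountant": "BA",
    "Accounting technician": "AA",
    "Administrative Assistant": "AA",
    "Admin Asst": "AA",
    "Floor nurse, RN": "AA",
    "Sales clerk": "HS",
}

# Create the new field after the database already exists.
for rec in records:
    rec["education"] = education_code[rec["occupation"]]

# List only the occupations coded at the associate degree level.
associate = sorted(r["occupation"] for r in records if r["education"] == "AA")

# Cross-tabulate the new code against a closed-ended question.
crosstab = Counter((r["intent"], r["education"]) for r in records)
```

A second pass of the sort would add another field in the same way, for example a subject area code assigned only to the records already coded "AA".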
All of these functions can be done in the microcomputer, or the
new code can be made numeric and transferred to a mainframe
statistical package for analysis. The microcomputer method is
used when there is a preference for retaining the exact words of
the respondent, not just the category of response.

The strategies illustrated above apply equally well to
sentence-long comments from respondents as they do to factual
statements.

Instrument Design for Open-ended Questions

The RDBMS can be used to clarify who is responding to open-ended
questions in surveys, thus making the comments more useful, even
in cases of underenumeration. The problem of underenumeration
can also be approached through effective design techniques.

I find the highest response rate is on partially closed-ended
questions, those which ask the respondent to specify the meaning
of "other" or to list something factual. These questions get at
least a S5% response rate, which is about the same as
closed-ended questions. For example, 99% of respondents who are
employed provide a meaningful response to the occupation
question.




My experience indicates that, except for populations that are
hesitant about surveys in general, 75% to 99% will respond to a
general open-ended question placed near the end of a well
designed study which asks for specific recommendations for the
program/college. These narrative statements help to capture the
general attitude of respondents more clearly than the
closed-ended responses.

To overcome problems of misunderstanding the open-ended question,
I structure opportunities for comments throughout the evaluation
sections of a survey. For example, if the respondent is
evaluating the helpfulness of four student services, they are
offered space to comment on each service immediately after
evaluating the service as shown in the sample below:



JOB PLACEMENT SERVICES          VERY MUCH   SOMEWHAT   VERY LITTLE   DID NOT USE
Comments:

ADMISSIONS AND RECORDS OFFICE   VERY MUCH   SOMEWHAT   VERY LITTLE   DID NOT USE
(getting transcripts)
Comments:

COURSE CATALOG, SCHEDULE        VERY MUCH   SOMEWHAT   VERY LITTLE   DID NOT USE
& NEWSPAPER ADS FOR
REGISTRATION
Comments:

WOMEN'S CENTER                  VERY MUCH   SOMEWHAT   VERY LITTLE   DID NOT USE
(programs for men & women)
Comments:



This method also results in underenumeration problems as most
respondents do not care to comment on everything, but each
comment is focused. Additionally, I have found it helpful to
present the open-ended responses sorted by how the respondent
responded to the closed-ended question.
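
As a rough illustration of that presentation, the Python sketch below orders the comments for one service by the rating each respondent gave. The rating labels follow the sample instrument; the responses themselves and the variable names are hypothetical.

```python
# Hypothetical responses for one service: a closed-ended rating plus a comment.
RATING_ORDER = ["very much", "somewhat", "very little", "did not use"]

responses = [
    {"id": 1, "rating": "somewhat", "comment": "Hard to reach by phone"},
    {"id": 2, "rating": "very much", "comment": "Found my job through them"},
    {"id": 3, "rating": "very little", "comment": ""},
    {"id": 4, "rating": "very much", "comment": "Friendly staff"},
]

# Keep only respondents who actually commented, then order by rating.
commented = [r for r in responses if r["comment"].strip()]
commented.sort(key=lambda r: RATING_ORDER.index(r["rating"]))

# Share of respondents who commented, a quick check on underenumeration.
comment_rate = len(commented) / len(responses)

for r in commented:
    print(f'{r["rating"].upper():12} {r["comment"]}')
```

Reporting the comment rate alongside the sorted list lets a reader judge how far the comments can be taken as representative of everyone who rated the service.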
The Data Entry Process

So far, the comments of respondents have somehow magically
arrived on the researcher's computer screen for review and for
use in the automated Q-sort. Actually, the process of data entry
is quite simple and efficient.

The RDBMS allows for creation of a data entry screen. CONDOR
allows for design of an especially attractive data entry screen.
Here is an example of a screen display for entry:



CON'T AK NATIVE STUDENT SURVEY F84 - SCREEN

ID 11 REC,4 

13.1 1 13.1A 11111111111111111111111111111111111111111111111111111111111111111

13.2 1 13.2A 11111111111111111111111111111111111111111111111111111111111111111

13.3 1 13.3A 11111111111111111111111111111111111111111111111111111111111111111

13.4 1 13.4A 11111111111111111111111111111111111111111111111111111111111111111

14 1 15 1 16 1 16OTH 11111111111111111111111 11111111111111111111111111111111
16.A 1 17 11 18 1 19.1 1 19.2 1 19.3 1 19.4 1 19.5 1 19.6 1 19.7 1
20 1 21 1111111111111111111111111111111111111111111111111111111111111 22 1 
23 1111111111111111111111111111111111111111111111111111111111111111111 



The "1" represents spaces on the screen (inverse video) which can
be filled with characters by the data entry worker. Each set of
"1" is a data field which may have specifications as to whether
the entry can be alpha, numeric, or both. If numeric, the range
of numbers which can be entered can be pre-set.
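
The kind of checking such field specifications provide can be sketched in Python. This is not how CONDOR defines fields; the specification layout, the `validate` function, and the 1 to 4 range are assumptions chosen to mirror fields 13.1 and 13.1A on the sample screen.

```python
# Hypothetical field specifications: type ("alpha", "numeric", or "both"),
# width in characters, and an optional allowed range for numeric fields.
FIELD_SPECS = {
    "13.1": {"type": "numeric", "width": 1, "range": (1, 4)},
    "13.1A": {"type": "both", "width": 65},
}

def validate(field, value):
    """Return True if the typed value satisfies the field's specification."""
    spec = FIELD_SPECS[field]
    if len(value) > spec["width"]:
        return False
    if spec["type"] == "numeric":
        if not (value.isascii() and value.isdigit()):
            return False
        if "range" in spec:
            low, high = spec["range"]
            return low <= int(value) <= high
        return True
    if spec["type"] == "alpha":
        return value.replace(" ", "").isalpha()
    return True  # "both": any characters that fit the width
```

Rejecting out-of-range codes at entry time keeps the closed-ended fields clean before they are separated from the comments.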



The long "1" units are open-ended comment spaces which follow
immediately after the related closed-ended question. The data
entry person simply types in the respondent's statement. In the
rare case of a wordy respondent, the data entry person
abbreviates the response to fit in the allowed space.

Once the data entry is completed, the "project" command is used
to separate the numeric fields from the alpha fields. Numerics
are transferred to the mainframe computer for standard analysis
by a statistical package. Two databases now exist:

• the first in the mainframe with numeric information only, and

• the second in the microcomputer RDBMS with all the
information.
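
A minimal Python sketch of this separation follows, assuming records are held as dictionaries. The field names, sample values, and the `numeric_fields` helper are hypothetical and stand in for the CONDOR "project" command.

```python
# Hypothetical records as entered on the data entry screen.
records = [
    {"id": 11, "q13_1": 3, "q13_1a": "Staff were helpful", "q14": 2},
    {"id": 12, "q13_1": 1, "q13_1a": "", "q14": 4},
]

def numeric_fields(relation):
    """Names of fields whose values are numeric in every record."""
    names = relation[0].keys()
    return [n for n in names
            if all(isinstance(rec[n], (int, float)) for rec in relation)]

# "Project" the numeric fields into a second relation for the statistical package.
fields = numeric_fields(records)
mainframe_file = [{n: rec[n] for n in fields} for rec in records]

# The full relation, comments included, stays in the microcomputer database.
micro_database = records
```

Keeping the respondent ID in both relations is what allows the comments to be matched back to the numeric answers later.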

One additional advantage of the RDBMS is that data is entered
only once. The comments of respondents, once entered into the
database, can be used in the final report without ever being
typed again. Typos can be corrected in either of two places: 1)
in the database on the data entry screens or 2) when the list of
statements is transferred to a word processing file for final
report writing. This feature greatly reduces the amount of
clerical work involved in preparing reports for surveys.
Technical Considerations Related to Use of RDBMS

Each RDBMS has limitations on:

• Number of characters per field,

• Fields or bytes per record,

• Screens per record, and

• Records per database file.

These characteristics place limits on the researcher which may
require some creative planning.

In this review, I will comment only on the limits of CONDOR 20-3.

• CONDOR allows 127 characters per field, which implies that
comments of respondents could be as long as 127 characters, a
fairly wordy sentence. Don't believe it for one minute.
While it is possible to enter and list a 127 character field,
CONDOR is really structured on an 80 column card concept.
Consequently, the system places carriage returns at the end
of every 80 column line. Once the researcher starts
"joining" and "projecting" a few times, these unwanted
carriage returns will create havoc with the comment field. I
choose, instead, to use comment fields of about 65
characters. When longer comments are likely, I allow more
than one field per comment.

• CONDOR allows 127 fields and 1024 bytes per record.
Since it would be very hard to get 127 fields on one screen,
the field limit is not a problem. Also, since CONDOR stores
numbers very economically, the byte length is also not a
problem.

• CONDOR is limited to 1 screen per record. The one screen per
record limit is a problem. Most lengthy surveys require two
or more screens for data entry. Fortunately, the "join"
command allows all these records from different screens to be
combined as needed during data manipulation.

Each RDBMS will have unique limits and strengths in use for
survey data entry and open-ended analysis.
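
Two of the workarounds described for CONDOR, splitting a long comment across several 65 character fields and joining the records entered on separate screens, can be sketched in Python. The function names, field names, and sample comment are hypothetical; only the 65 character width comes from the text.

```python
import textwrap

FIELD_WIDTH = 65  # comment field width chosen to stay under 80 columns

def split_comment(comment, max_fields=2):
    """Break a long comment into pieces that each fit one comment field."""
    pieces = textwrap.wrap(comment, width=FIELD_WIDTH)
    if len(pieces) > max_fields:
        raise ValueError("comment needs abbreviating to fit the allowed fields")
    # Pad with empty strings so every record has the same number of fields.
    return pieces + [""] * (max_fields - len(pieces))

def join_screens(screen_a, screen_b, key="id"):
    """Recombine records entered on two screens into one record per respondent."""
    by_key = {rec[key]: rec for rec in screen_b}
    return [{**rec, **by_key[rec[key]]} for rec in screen_a if rec[key] in by_key]

comment = ("The job placement office helped me write a resume and set up "
           "two interviews, but the hours were hard for evening students.")
part1, part2 = split_comment(comment)

screen1 = [{"id": 11, "q13_1": 3}]
screen2 = [{"id": 11, "q21": part1, "q21b": part2}]
full_record = join_screens(screen1, screen2)
```

Breaking on word boundaries means the pieces can be rejoined with a single space when the comment is printed in a report.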



While every survey done in a college setting is unique, much of
the data manipulation to be done in the RDBMS is repetitious.
This is where macro type commands can be used to save
considerable time.

CONDOR allows the researcher to program the PROJECT, JOIN, SORT,
SELECT, PRINT commands needed for a particular analysis and save
that work for future sessions which require similar functions.
Since the microcomputer RDBMS works reasonably slowly, this
programmable feature allows the researcher to set up a few
requests for information, set the system to work, head for lunch,
and come back an hour later with pages of useful analysis ready
for final review.
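
The idea of a saved command sequence can be illustrated in Python as an ordered list of steps that is rerun on each new survey file. The step names echo the commands listed above, but the implementations, the `COMMENT_REPORT` macro, and the sample data are simplified assumptions, not CONDOR syntax.

```python
# A saved "macro": an ordered list of steps that can be rerun on any survey file.
# The step implementations are simplified Python stand-ins for RDBMS commands.

def select(relation, field, value):
    """Keep only the records whose field equals the given value."""
    return [rec for rec in relation if rec[field] == value]

def sort_by(relation, field):
    """Return the records ordered by one field."""
    return sorted(relation, key=lambda rec: rec[field])

def project(relation, fields):
    """Keep only the named fields of each record."""
    return [{f: rec[f] for f in fields} for rec in relation]

COMMENT_REPORT = [
    (select, {"field": "education", "value": "AA"}),
    (sort_by, {"field": "occupation"}),
    (project, {"fields": ["occupation"]}),
]

def run_macro(macro, relation):
    """Apply each saved step in order, feeding the result to the next step."""
    for step, options in macro:
        relation = step(relation, **options)
    return relation

survey = [
    {"occupation": "Secretary", "education": "AA"},
    {"occupation": "Teacher", "education": "BA"},
    {"occupation": "Accounting technician", "education": "AA"},
]
report = run_macro(COMMENT_REPORT, survey)
```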